Equilibrium free energy differences are given by exponential averages of nonequilibrium work
values; such averages, however, often converge poorly, as they are dominated by rare realizations.
I show that there is a simple and intuitively appealing description of these rare but dominant
realizations. This description is expressed as a duality between "forward" and "reverse" processes,
and provides both heuristic insights and quantitative estimates regarding the number of realizations
needed for convergence of the exponential average. Analogous results apply to the equilibrium
perturbation method of estimating free energy differences. The pedagogical example of a piston
and gas [R.C. Lua and A.Y. Grosberg, J. Phys. Chem. B 109, 6805 (2005)] is used to illustrate the
general discussion.
The nonequilibrium work theorem,

$$ \langle e^{-\beta W} \rangle = e^{-\beta \Delta F}, \qquad (1) $$

relates the work performed on a system during a nonequilibrium process to the free energy difference between two
equilibrium states of that system. The angular brackets denote an average over an ensemble of realizations (repetitions)
of a thermodynamic process, during which a system evolves in time as a control parameter $\lambda$ is varied from an initial
value $A$ to a final value $B$. $W$ is the external work performed on the system during one realization; $\Delta F = F_B - F_A$
is the free energy difference between two equilibrium states of the system, corresponding to $\lambda = A$ and $B$; and $\beta$ is
the inverse temperature of a heat reservoir with which the system is equilibrated prior to the start of each realization
of the process. A sample of derivations of Eq. 1 can be found in Refs. [..., 9, 10]; pedagogical and
review treatments are given in Refs. [...]; for experimental tests of this and closely related results,
see Refs. [...]; finally, Refs. [...] discuss quantal versions of the theorem.

In principle, Eq. 1 implies that $\Delta F$ can be estimated using nonequilibrium experiments or numerical simulations. If
we repeat the thermodynamic process $N$ times, and observe work values $W_1, W_2, \cdots, W_N$, then

$$ \Delta F \approx -\beta^{-1} \ln \left[ \frac{1}{N} \sum_{n=1}^{N} e^{-\beta W_n} \right], \qquad (2) $$

where the approximation becomes an equality in the limit of infinitely many realizations, $N \to \infty$. In practice,
the average of $e^{-\beta W}$ is often dominated by very rare realizations, leading to poor convergence with $N$. The aim of
this paper is to develop an understanding of these rare but important realizations. I will argue that there is a simple
description of these dominant realizations, which leads to both quantitative estimates and useful heuristic insights
regarding the number of realizations needed for convergence of the average of $e^{-\beta W}$.
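As a rough numerical illustration of this convergence problem, the following sketch evaluates the exponential average for work values drawn from an assumed Gaussian distribution. The Gaussian form, the parameter values, and the function names `estimate_delta_f` and `sample_gaussian_work` are illustrative assumptions, not part of the paper. For a Gaussian of mean $\overline{W}$ and standard deviation $\sigma$ the exact answer is $\Delta F = \overline{W} - \beta\sigma^2/2$, so the mean below is chosen to give $\Delta F = 0$.

```python
import math
import random

def estimate_delta_f(work_values, beta=1.0):
    """Exponential-average estimate of Delta F from a finite list of work values."""
    # Shift by the smallest work value so the exponentials cannot overflow.
    w_min = min(work_values)
    total = sum(math.exp(-beta * (w - w_min)) for w in work_values)
    return w_min - math.log(total / len(work_values)) / beta

def sample_gaussian_work(n, mean, sigma, rng):
    """Draw n work values from an assumed (illustrative) Gaussian work distribution."""
    return [rng.gauss(mean, sigma) for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    beta, sigma = 1.0, 3.0
    mean = 0.5 * beta * sigma ** 2   # chosen so that the exact Delta F is 0
    for n in (10, 1000, 100000):
        work = sample_gaussian_work(n, mean, sigma, rng)
        print(n, estimate_delta_f(work, beta))
```

With a broad distribution such as this one, the estimate typically stays well above the exact value until the far left tail has been sampled, which is the behavior the rest of the paper sets out to explain.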

The organization of this paper is as follows. In Section I the central result is summarized, then illustrated using a
simple example. Section II contains a derivation of this result. Section III discusses the number of realizations needed
for convergence of the exponential average. Section IV focuses on the free energy perturbation method (a limiting
case of Eq. 1), and Section V concludes with a brief discussion.



I. SUMMARY AND ILLUSTRATION OF CENTRAL RESULT 

The average in Eq. 1 can be written as

$$ \langle e^{-\beta W} \rangle = \int dW\, p(W)\, e^{-\beta W} = \int dW\, g(W), \qquad (3) $$

where $p(W)$ is the ensemble distribution of work values. Fig. 1 shows a schematic plot of this distribution, and of the
integrand in Eq. 3, $g(W) = p(W)\, e^{-\beta W}$. While $p$ is peaked near the mean of the distribution,

$$ \overline{W} = \int dW\, p(W)\, W, \qquad (4) $$

$g$ is peaked around a lower value, $W^\dagger$, as the factor $e^{-\beta W}$ has the effect of strongly weighting those work values that
are in the far left tail of $p$. Explicitly,

$$ W^\dagger = \frac{1}{c} \int dW\, g(W)\, W, \qquad (5) $$

where $c = \int dW\, g(W)$. As pointed out by Ritort in the context of a "trajectory thermodynamics" formalism [...],
the average of $e^{-\beta W}$ is dominated by the region near the peak of the integrand of Eq. 3, i.e. near $W^\dagger$. I will use the
term typical to refer to those realizations whose work values are near the peak of $p$, and dominant to refer to those for
which the work is near the peak of $g$. Typical realizations are the ones that we ordinarily observe when carrying out
the process, while dominant realizations are those rare realizations that contribute the greatest share to the average
of $e^{-\beta W}$.
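To make the distinction between typical and dominant work values concrete, here is a minimal sketch that evaluates the two averages, one over $p(W)$ and one over $g(W) = p(W)e^{-\beta W}$, by simple quadrature. The Gaussian form of $p(W)$ and the function names are assumptions made only for illustration; for a Gaussian of mean $\overline{W}$ and width $\sigma$ the dominant value is known in closed form to be $\overline{W} - \beta\sigma^2$, which gives a check on the numerics.

```python
import math

def gaussian_pdf(w, mean, sigma):
    """Assumed (illustrative) Gaussian work distribution p(W)."""
    return math.exp(-0.5 * ((w - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def typical_and_dominant_work(mean, sigma, beta=1.0, half_width=12.0, n_grid=20001):
    """Return (mean of p, mean of g) computed on a uniform grid of work values."""
    lo = mean - half_width * sigma
    dw = 2.0 * half_width * sigma / (n_grid - 1)
    grid = [lo + i * dw for i in range(n_grid)]
    p = [gaussian_pdf(w, mean, sigma) for w in grid]
    # g(W) = p(W) exp(-beta W); a constant offset in the exponent cancels in the ratio.
    g = [p_i * math.exp(-beta * (w - mean)) for p_i, w in zip(p, grid)]
    w_typical = sum(p_i * w for p_i, w in zip(p, grid)) / sum(p)
    w_dominant = sum(g_i * w for g_i, w in zip(g, grid)) / sum(g)
    return w_typical, w_dominant

if __name__ == "__main__":
    print(typical_and_dominant_work(mean=5.0, sigma=2.0, beta=1.0))
```

The separation between the two values, measured in units of $\sigma$, is $\beta\sigma$, so broad work distributions push the dominant region far into the tail.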
FIG. 1: During most realizations of the process, we observe work values near the peak of $p(W)$. However, the average of $e^{-\beta W}$
is dominated by realizations for which the work is observed to be in the region around the peak of $g(W) = p(W)\, e^{-\beta W}$.



For the process discussed in the previous paragraph, the work parameter $\lambda$ is varied from $A$ to $B$. Following
Crooks, let us also consider a process during which $\lambda$ is manipulated from $B$ to $A$, and let us use the terms forward
(F) and reverse (R) to distinguish between the two processes. Specifically, if $\lambda_t^F$ denotes the schedule for varying
the work parameter during the forward process, from $\lambda_0^F = A$ to $\lambda_\tau^F = B$, then the reverse process is defined by the
schedule

$$ \lambda_t^R = \lambda_{\tau - t}^F, \qquad (6) $$

where $\tau$ is the duration of either process.

Note that the nonequilibrium work theorem applies to both processes:

$$ \langle e^{-\beta W} \rangle_F = e^{-\beta \Delta F}, \qquad \langle e^{-\beta W} \rangle_R = e^{+\beta \Delta F}, \qquad (7) $$

where $\langle \cdots \rangle_{F/R}$ denotes an average over realizations of the forward/reverse process; and by convention $\Delta F = F_B - F_A$
in both equations. As with the forward process, it is useful to distinguish between typical realizations of the reverse
process, and the dominant realizations that contribute the most to $\langle e^{-\beta W} \rangle_R$.

Throughout this paper, the evolution of the system is modeled as a Hamiltonian trajectory in phase space, where the
Hamiltonian is made time-dependent by externally varying the work parameter. (See Section V for a brief discussion
of other, e.g. stochastic, models.) In the absence of magnetic fields (see Footnote 1), or more precisely under the assumption of
time-reversal invariance, Eq. 13, forward and reverse realizations come in conjugate pairs related by time-reversal, as
illustrated in Fig. 2: if a phase space trajectory $\Gamma_t^F$ represents a possible realization of the forward process (a solution
of Hamilton's equations when $\lambda$ is varied from $A$ to $B$), then its conjugate twin,

$$ \Gamma_t^R = \Gamma_{\tau - t}^{F*}, \qquad (8) $$

represents a possible realization of the reverse process. The asterisk denotes a reversal of momenta: $(q, p)^* = (q, -p)$.
The trajectory $\Gamma_t^R$ depicts the sequence of events that we would observe, if we were to film the forward realization $\Gamma_t^F$
and then run the movie backward.

Footnote 1: When magnetic fields are present, the conjugate pairing of trajectories occurs if the reverse process is defined not only by Eq. 6 but also
by a change in the signs of these fields [...]. The central result of this paper then remains valid. For simplicity of presentation, however,
I will restrict myself to the time-reversal-invariant situation, Eq. 13.

FIG. 2: A conjugate pair of trajectories. The horizontal axis (q) represents the complete set of configurational coordinates
(e.g. particle positions), while the vertical axis (p) represents the set of associated momenta. $\Gamma$ denotes a point in this
many-dimensional phase space; and $\Gamma_t^{F/R}$ denotes a realization of the forward/reverse process, with time running from $t = 0$ to
$t = \tau$. The two trajectories are related by time-reversal: $\Gamma_t^{F*} = \Gamma_{\tau - t}^R$, where $(q, p)^* = (q, -p)$. In the notation of Section II,
the upper curve is the trajectory $\gamma^F$, the lower curve is the trajectory $\gamma^R$.

The central result of this paper, Eq. 27 below, states that the dominant realizations of the forward process are the
conjugate twins of typical realizations of the reverse process, and vice versa. Thus, the trajectories that contribute
the most to $\langle e^{-\beta W} \rangle_F$ are those during which the behavior of the system appears as though we had filmed a typical
realization of the reverse process, and then run the movie backward. The existence of such a duality was anticipated
by Ritort, who observed that for large systems the work performed during a dominant realization of one process is
(minus) the work performed during a typical realization of the conjugate process; see comments following Eq. 58 of
Ref. [...].

As an illustration of this result, consider an ideal gas of (mutually non-interacting) point particles inside a box
closed off at one end by a piston, and imagine that we act on the gas by pulling the piston outward. In the context of
Eq. 1 this system has recently been studied by several groups [27, 28, 29, 30]. Of particular relevance to the present
paper is the analysis of Lua and Grosberg [27], who showed by explicit calculation that, in the fast piston limit,
the dominant realizations are characterized by particles with initial velocities sampled from deep within the tail of a
Maxwellian distribution; and for the reverse process, when the piston is pushed into the gas, the dominant realizations
are those for which there are no particle-piston collisions [29]. These conclusions are consistent with the discussion
below.

Let $n_p \gg 1$ be the total number of particles, each of mass $m$. Imagine that we begin with the piston at a location
$A$, corresponding to a box of length $L$, and we prepare the gas in canonical equilibrium at temperature $T$, i.e. with
particle velocities sampled independently from a Maxwellian distribution. Now we rapidly pull the piston outward,
from $A$ to $B$, over a time $\tau$ and at constant speed $u = L/\tau$, thus increasing the length of the box from $L$ to $2L$ (Fig. 3a).
Since the volume of the box is doubled, the free energy difference between the equilibrium states corresponding to the
piston locations $A$ and $B$ is

$$ \Delta F = -n_p\, \beta^{-1} \ln 2. \qquad (9) $$

During a realization of this process, whenever a particle collides with the moving piston, the particle suffers a change
of kinetic energy, $\delta K = -2mu(v_x - u) < 0$, where $v_x$ is the component of the particle's velocity parallel to the motion
of the piston, prior to the collision. The total work $W$ is the sum of such contributions.

If the process described above is the forward process, then the reverse process involves pushing the piston into
the gas at speed $u$, from $B$ to $A$ (Fig. 4a), starting from an initial state of thermal equilibrium. Each particle-piston
collision now produces a change $\delta K = 2mu(v_x + u) > 0$ in the particle's kinetic energy. In the context of the piston and
gas example, I will use the terms expansion and compression to denote the forward and reverse processes, respectively.

Now suppose the piston speed is much greater than the thermal particle speed:

$$ u \gg v_{th} = \sqrt{3/(m\beta)}. \qquad (10) $$

In this case, the particle density profile typically changes very little during expansion (Fig. 3a): most particles remain
in the left half of the box, and few if any collide with the piston; consequently

$$ W^F \approx 0. \qquad (11) $$

(The superscript indicates the forward process, i.e. expansion.)

During the compression process (Fig. 4a), the piston typically collides with about half of the gas particles: roughly
speaking, those initially located in the right half of the box. This generates a shock wave of particles streaming leftward
at approximately twice the piston speed. At time $\tau$ the front of this wave reaches the wall at $x = 0$, just as the
piston arrives at $A$. Thus at the end of such a realization, half the particles (those untouched by the piston) are
characterized by the original Maxwellian velocity distribution, while the other half have velocities with $v_x \approx -2u$.
For such a realization,

$$ W^R \approx \frac{n_p}{2} \cdot 2mu^2 = n_p\, m u^2. \qquad (12) $$

If Fig. 4a illustrates the trajectory just described, then Fig. 4b illustrates its conjugate twin. Here the piston begins
at $A$, with half the particles streaming rightward at $v_x \approx +2u$. As the piston moves from $A$ to $B$ at speed $u$,
each of these very fast particles collides once with the piston, losing most of its kinetic energy. At the moment the
piston reaches $B$, the container is uniformly filled with a gas characterized by a Maxwellian velocity distribution at
temperature $T$. Needless to say, this "anti-shock" wave represents an exotic sequence of events! When $n_p \gg 1$ and
$u \gg v_{th}$, the probability of sampling an initial microstate for which half the particles have $v_x \approx +2u$ is fantastically
small. However, according to the central result of this paper, rare realizations of this sort are precisely the dominant
ones that contribute most to $\langle e^{-\beta W} \rangle_F$.

Similarly, the dominant realizations of the compression process are the conjugate twins of typical realizations of the
expansion process. Thus to achieve convergence of $\langle e^{-\beta W} \rangle_R$, we must observe realizations during which the bulk of
the gas happens to be localized in the left half of the box at $t = 0$. Then, as the piston moves rapidly from $B$ to $A$,
it sweeps through a largely empty region, as in Fig. 3b. Again, this represents an unusual scenario: it is unlikely that
the randomly sampled initial conditions are such that virtually all the particles are found in the left half of the box.

To summarize, while a typical realization of the expansion process is characterized by no piston-particle collisions,
and a typical realization of the compression process is characterized by $\approx n_p/2$ collisions (Figs. 3a, 4a), for dominant
realizations it is the other way around (Figs. 4b, 3b).

II. DERIVATION OF CENTRAL RESULT 

Consider a classical system described by a Hamiltonian $H(\Gamma; \lambda)$, where $\Gamma = (q, p)$ denotes a point in the system's
phase space (a microstate), and $\lambda$ is an externally controlled parameter, such as the piston location in the above
example. For every value of $\lambda$, assume $H$ is time-reversal invariant:

$$ H(\Gamma^*; \lambda) = H(\Gamma; \lambda), \qquad \Gamma^* = (q, -p). \qquad (13) $$

FIG. 3: (a) A typical realization of the forward process. Due to the great speed of the piston (Eq. 10), most particles remain
in the left half of the box for the duration of the process. (b) The conjugate twin of the realization depicted in (a). Almost all
particles begin in the left half of the box, and the piston then moves rapidly through a largely empty region. (The vertical gray
arrows specify the direction of increasing time.)

This system can be prepared in an equilibrium state, by placing it in weak thermal contact with a sufficiently large
heat reservoir at temperature $T$, holding the parameter fixed at a value $\lambda$, and then removing the reservoir after a
sufficiently long relaxation time. This generates a microstate $\Gamma_0$ that is a random sample from the Boltzmann-Gibbs
distribution,

$$ p_\lambda(\Gamma_0) = \frac{1}{Z_\lambda} \exp[-\beta H(\Gamma_0; \lambda)], \qquad (14) $$

where $Z_\lambda = \int d\Gamma\, \exp[-\beta H(\Gamma; \lambda)]$ is the partition function. The free energy associated with this equilibrium state is

$$ F_\lambda = -\beta^{-1} \ln Z_\lambda. \qquad (15) $$

In the analysis and discussions below, the dependence of $p_\lambda$, $Z_\lambda$, and $F_\lambda$ on temperature will be left implicit, and
equilibrium states will be identified by the parameter value $\lambda$.
FIG. 4: (a) A typical realization of the reverse process. As the piston moves rapidly into the gas, the particles with which it
collides gain large components of velocity ($\approx 2u$) along the direction of motion of the piston. The particles with the attached
arrows are meant to represent these fast particles, while the unadorned ones are characterized by thermal velocities. (b) The
conjugate twin of the realization depicted in (a). Half the particles are initially moving at great speeds ($\approx 2u$) in the direction
of the piston. By the end of the process, after each of these has collided with the piston, the velocity distribution of the entire
gas is thermal.



To perform the forward process (F), we first prepare the system in the equilibrium state $A$, then we remove the
heat reservoir. Then, from $t = 0$ to $t = \tau$ we let the system evolve under Hamilton's equations as we vary $\lambda$ from $A$
to $B$ according to a pre-determined schedule, $\lambda_t^F$. Let $H_t^F(\Gamma) = H(\Gamma; \lambda_t^F)$ denote the time-dependent Hamiltonian
obtained when $\lambda$ is varied in this manner, and let $\gamma^F \equiv \{\Gamma_t^F\}_0^\tau$ denote a phase space trajectory evolving under this
Hamiltonian. The notation indicates that the trajectory $\gamma^F$ passes through the set of points $\Gamma_t^F$, for $0 \le t \le \tau$. Such
a trajectory describes the microscopic history of the system during a single realization of the forward process.

By repeating this process infinitely many times, we generate a sequence of trajectories, $\{\gamma_1^F, \gamma_2^F, \cdots\}$. These can
be viewed as random samples from a probability distribution $\mathcal{P}^F[\gamma^F]$, defined on the set of all possible trajectories
generated by $H_t^F$. Because the dynamics are deterministic, the probability of observing a given trajectory is simply
that of sampling its initial conditions from a canonical distribution [...]:

$$ \mathcal{P}^F[\gamma^F] = p_A(\Gamma_0^F) = \frac{1}{Z_A} \exp[-\beta H(\Gamma_0^F; A)]. \qquad (16) $$

Since the system is thermally isolated (not in contact with a heat reservoir) as $\lambda$ is varied from $A$ to $B$, the work
performed on the system is equal to the net change in its energy:

$$ W^F[\gamma^F] = H(\Gamma_\tau^F; B) - H(\Gamma_0^F; A). \qquad (17) $$

Similar remarks and notation apply to the reverse process (R). The system is prepared in equilibrium state $B$, then
$\lambda$ is varied from $B$ to $A$ (Eq. 6), generating a Hamiltonian trajectory $\gamma^R \equiv \{\Gamma_t^R\}_0^\tau$, with probability

$$ \mathcal{P}^R[\gamma^R] = p_B(\Gamma_0^R) = \frac{1}{Z_B} \exp[-\beta H(\Gamma_0^R; B)]. \qquad (18) $$

The work performed is

$$ W^R[\gamma^R] = H(\Gamma_\tau^R; A) - H(\Gamma_0^R; B). \qquad (19) $$

As mentioned in Section I, every forward trajectory $\gamma^F$ has a conjugate twin $\gamma^R$ (Fig. 2) that is a solution of
Hamilton's equations when the parameter is varied according to the reverse schedule. In the remainder of this paper,
whenever $\gamma^F$ and $\gamma^R$ appear together (e.g. in the same equation or sentence) it will be understood that these two
trajectories form such a conjugate pair.

From Eqs. 13, 17 and 19 we get

$$ W^F[\gamma^F] = -W^R[\gamma^R]: \qquad (20) $$

the work during a forward realization is the opposite of that during its conjugate twin. In what follows it will be
convenient to deal with dissipated work values

$$ W_d^F[\gamma^F] \equiv W^F[\gamma^F] - \Delta F \qquad (21a) $$
$$ W_d^R[\gamma^R] \equiv W^R[\gamma^R] + \Delta F = -W_d^F[\gamma^F]. \qquad (21b) $$

(The term "dissipated work" here means the amount by which the work $W$ exceeds that which would have been
performed, had the process been carried out reversibly and isothermally.) In terms of these quantities, Eq. 7 becomes

$$ \langle e^{-\beta W_d^F} \rangle_F = \langle e^{-\beta W_d^R} \rangle_R = 1. \qquad (22) $$
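The conjugate pairing of Eq. 8 and the antisymmetry of the work, Eq. 20, can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the system is a one-dimensional harmonic oscillator $H = p^2/2m + \lambda q^2/2$ (time-reversal invariant, as required by Eq. 13), the schedule is a linear ramp of $\lambda$, and the dynamics are integrated with a velocity-Verlet scheme, which is itself time-reversible. None of these choices, nor the function names, come from the paper.

```python
def hamiltonian(q, p, lam, m=1.0):
    """Toy Hamiltonian H = p^2/(2m) + lam*q^2/2 (an assumption for illustration)."""
    return p * p / (2.0 * m) + 0.5 * lam * q * q

def evolve(q, p, schedule, dt, m=1.0):
    """Velocity-Verlet integration; schedule[k] is the work parameter at time k*dt."""
    for k in range(len(schedule) - 1):
        p -= 0.5 * dt * schedule[k] * q        # half kick with the current parameter
        q += dt * p / m                         # drift
        p -= 0.5 * dt * schedule[k + 1] * q    # half kick with the updated parameter
    return q, p

def forward_and_twin_work(q0, p0, a=1.0, b=4.0, n_steps=2000, dt=1e-3):
    """Work along a forward trajectory and along its conjugate twin."""
    forward_schedule = [a + (b - a) * k / n_steps for k in range(n_steps + 1)]
    reverse_schedule = forward_schedule[::-1]   # reverse schedule, as in Eq. 6
    q1, p1 = evolve(q0, p0, forward_schedule, dt)
    w_forward = hamiltonian(q1, p1, b) - hamiltonian(q0, p0, a)
    # The twin starts from the final forward point with the momentum reversed.
    q2, p2 = evolve(q1, -p1, reverse_schedule, dt)
    w_reverse = hamiltonian(q2, p2, a) - hamiltonian(q1, -p1, b)
    return w_forward, w_reverse, (q2, p2)

if __name__ == "__main__":
    print(forward_and_twin_work(1.0, 0.5))
```

The twin ends at the momentum-reversed initial point of the forward trajectory, and the two work values sum to zero up to floating-point rounding.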

We now have the elements in place to investigate the nature of those realizations that dominate the average $\langle e^{-\beta W} \rangle_F$,
or, equivalently, $\langle e^{-\beta W_d^F} \rangle_F$. Combining Eqs. 8, 13 and 15-18 we obtain a simple relationship between the probability of
observing a trajectory $\gamma^F$ during the forward process, and that of observing its twin $\gamma^R$ during the reverse:

$$ \frac{\mathcal{P}^F[\gamma^F]}{\mathcal{P}^R[\gamma^R]} = \exp\left(\beta W_d^F[\gamma^F]\right) = \exp\left(-\beta W_d^R[\gamma^R]\right). \qquad (23) $$

This result, originally obtained by Crooks for stochastic, Markovian dynamics [2], has here been derived in the context
of Hamiltonian evolution.

(In Eqs. 16 and 18, as in Ref. [...], the Liouville measure on initial conditions in phase space has implicitly been used
to define a measure on the space of trajectories: the "volume" of trajectory space, $d\gamma^F$, associated with a collection
of forward trajectories, is taken to be the phase space volume $d\Gamma_0^F$ occupied by their initial conditions. Liouville's
theorem then implies that the volume occupied by a given set of forward trajectories is equal to that of the conjugate
set of reverse trajectories. In this sense, the numerator and denominator in Eq. 23 are defined with respect to the
same measure on trajectory space.)

Let us now write $\langle e^{-\beta W_d^F} \rangle_F$ as an integral over forward trajectories:

$$ 1 = \langle e^{-\beta W_d^F} \rangle_F = \int d\gamma^F\, \mathcal{P}^F[\gamma^F]\, e^{-\beta W_d^F[\gamma^F]} = \int d\gamma^F\, \mathcal{Q}^F[\gamma^F], \qquad (24) $$

where $\mathcal{Q}^F \equiv \mathcal{P}^F e^{-\beta W_d^F}$. Let $\mathcal{C}_{typ}^F$ and $\mathcal{C}_{dom}^F$ denote the regions of trajectory space where $\mathcal{P}^F$ and $\mathcal{Q}^F$, respectively, are
peaked, as illustrated in Fig. 5. Thus, while $\mathcal{C}_{typ}^F$ contains the typical forward realizations, $\mathcal{C}_{dom}^F$ contains the dominant
ones, since the greatest contribution in Eq. 24 comes from the peak region of $\mathcal{Q}^F$.
For the reverse process, we have

$$ 1 = \langle e^{-\beta W_d^R} \rangle_R = \int d\gamma^R\, \mathcal{P}^R[\gamma^R]\, e^{-\beta W_d^R[\gamma^R]} = \int d\gamma^R\, \mathcal{Q}^R[\gamma^R], \qquad (25) $$

and we define regions $\mathcal{C}_{typ}^R$ and $\mathcal{C}_{dom}^R$ where $\mathcal{P}^R$ and $\mathcal{Q}^R$ are peaked.

FIG. 5: The upper box represents the space of all forward trajectories, the lower box the space of reverse trajectories, and in
this visual depiction conjugate pairing is indicated by reflection about a horizontal line between the two boxes (e.g. the two
crosses represent a pair of conjugate twins). The grey regions $\mathcal{C}_{typ}^F$ and $\mathcal{C}_{typ}^R$ denote the peaks of the probability distributions
$\mathcal{P}^F$ and $\mathcal{P}^R$, while the vertically striped regions $\mathcal{C}_{dom}^F$ and $\mathcal{C}_{dom}^R$ are the peaks of the functions $\mathcal{Q}^F$ and $\mathcal{Q}^R$. The dashed arrows
indicate the conjugate pairing that is the central result of this paper (Eq. 27).

Combining Eq. 23 with the definitions of $\mathcal{Q}^F$ and $\mathcal{Q}^R$, we get

$$ \mathcal{Q}^F[\gamma^F] = \mathcal{P}^R[\gamma^R] \qquad (26a) $$
$$ \mathcal{Q}^R[\gamma^R] = \mathcal{P}^F[\gamma^F]. \qquad (26b) $$

Eq. 26a states that the function $\mathcal{Q}^F$ is the conjugate image of the distribution $\mathcal{P}^R$; thus if we plot $\mathcal{P}^R$ in the lower
box of Fig. 5, then its mirror image in the upper box is $\mathcal{Q}^F$. It follows that the trajectories in $\mathcal{C}_{dom}^F$ (the peak region
of $\mathcal{Q}^F$) are the conjugate twins of those in $\mathcal{C}_{typ}^R$ (the peak region of $\mathcal{P}^R$):

$$ \mathcal{C}_{dom}^F \leftrightarrow \mathcal{C}_{typ}^R, \qquad (27a) $$

where the symbol $\leftrightarrow$ indicates a correspondence through conjugate pairing of trajectories. This is illustrated by the
pair of circles in Fig. 5, depicting the peak regions of $\mathcal{Q}^F$ and $\mathcal{P}^R$. Similarly, Eq. 26b gives us

$$ \mathcal{C}_{dom}^R \leftrightarrow \mathcal{C}_{typ}^F, \qquad (27b) $$

as illustrated by the ellipses in Fig. 5. Thus the trajectories typically observed during the reverse process are the
conjugate twins of those that dominate $\langle e^{-\beta W} \rangle$ for the forward process (Eq. 27a), and vice-versa (Eq. 27b). This is the
central result of this paper.
Let us consider for a moment the special case of a cyclic process, for which the final value of the work parameter
is the same as the initial value: $\lambda_0^F = A = B = \lambda_\tau^F$. In this situation $\Delta F = 0$, identically, and Eq. 7 becomes

$$ \langle e^{-\beta W} \rangle_F = 1, \qquad (28) $$

a result originally derived by Bochkov and Kuzovlev [...]. Under the additional assumption of a time-symmetric
schedule, $\lambda_t^F = \lambda_{\tau - t}^F$, the forward and reverse processes are identical. Thus, for processes that are time-symmetric
(and therefore also cyclic), there is no distinction between "forward" and "reverse", and Eq. 27 is particularly easy
to state: the exponential average is dominated by the conjugate twins of typical realizations.

In the analysis leading to Eq. 27 it has implicitly been assumed that the distributions $\mathcal{P}^F[\gamma^F]$ and $\mathcal{P}^R[\gamma^R]$ are
sharply peaked, i.e. that each process is characterized by well-defined "typical behavior". This assumption is often
reasonable for systems with many degrees of freedom (see e.g. Figs. 3a, 4a). It is useful, however, to formalize and
generalize the discussion, so as to avoid reliance on the notion of typicality.

Let $S^F$ denote an arbitrary set of forward trajectories, and define

$$ \Omega^F\{S^F\} \equiv \int_{S^F} d\gamma^F\, \mathcal{P}^F[\gamma^F], \qquad \Psi^F\{S^F\} \equiv \int_{S^F} d\gamma^F\, \mathcal{Q}^F[\gamma^F]. \qquad (29) $$

$\Omega^F$ is the probability of obtaining a trajectory in $S^F$, when carrying out the forward process; I will refer to this as the
statistical weight of the set $S^F$ in the ensemble of forward trajectories. In turn, $\Psi^F$ provides a measure of the relative
contribution of $S^F$ to the average $\langle e^{-\beta W} \rangle_F$ (see Eq. 24). Note that $\Omega^F = \Psi^F = 1$ when $S^F$ includes all forward
trajectories. Define analogous quantities $\Omega^R$ and $\Psi^R$ for a set of reverse trajectories $S^R$. Now take these two sets to
be related by conjugate pairing: $S^F$ is an arbitrary region in the upper box of Fig. 5, and $S^R$ is its mirror image in
the lower box. Using Eq. 26a we then get

$$ \Psi^F\{S^F\} = \Omega^R\{S^R\}. \qquad (30) $$

In words: the relative contribution of a set of forward trajectories to $\langle e^{-\beta W} \rangle_F$ is equal to the statistical weight of the
conjugate set of reverse trajectories. Of course, the converse is true as well: $\Psi^R\{S^R\} = \Omega^F\{S^F\}$.

Eq. 27a is a special case of Eq. 30: if $\mathcal{C}_{typ}^R$ contains 95% of the statistical weight of all reverse trajectories (a reasonable
definition of the peak region of $\mathcal{P}^R$), then the conjugate set of forward trajectories provides 95% of the contribution
to $\langle e^{-\beta W} \rangle_F$.
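A work-space version of Eq. 30 can be checked with a few lines of code. The sketch below assumes, purely for illustration, that the forward work distribution is Gaussian with mean $\overline{W}^F$ and width $\sigma$. Under that assumption $\Delta F = \overline{W}^F - \beta\sigma^2/2$, and Eq. 23 implies that the reverse distribution $p^R(-W)$ is a Gaussian of the same width centered at $\overline{W}^F - \beta\sigma^2$. The code compares the relative contribution of forward work values in an interval $[a, b]$ with the reverse-process probability of the same interval; the function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def psi_forward(a, b, mean_f, sigma, beta=1.0, n=20000):
    """Relative contribution of forward work values in [a, b] to <exp(-beta W_d)>_F."""
    delta_f = mean_f - 0.5 * beta * sigma ** 2    # exact for the assumed Gaussian
    dw = (b - a) / n
    total = 0.0
    for i in range(n):
        w = a + (i + 0.5) * dw                    # midpoint rule
        p_f = math.exp(-0.5 * ((w - mean_f) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += p_f * math.exp(-beta * (w - delta_f)) * dw
    return total

def omega_reverse(a, b, mean_f, sigma, beta=1.0):
    """Probability that the reverse process yields a value of -W inside [a, b]."""
    mean_r = mean_f - beta * sigma ** 2           # center of p^R(-W) for the Gaussian case
    return normal_cdf((b - mean_r) / sigma) - normal_cdf((a - mean_r) / sigma)

if __name__ == "__main__":
    print(psi_forward(-2.0, 2.0, 4.0, 2.0), omega_reverse(-2.0, 2.0, 4.0, 2.0))
```

Choosing $[a, b]$ to cover one standard deviation on either side of the reverse peak reproduces the logic of the 95% example above, with roughly 68% in place of 95%.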

As an illustration of Eq. 30 using the piston-and-gas example, consider the set $S_n^F$ of all forward realizations for
which there are exactly $n$ piston-particle collisions, and the conjugate set $S_n^R$ of reverse trajectories. Then

$$ \Psi^F\{S_n^F\} = \Omega^R\{S_n^R\}, \qquad (31) $$

i.e. the relative contribution of $n$-collision realizations to $\langle e^{-\beta W} \rangle_F$ is exactly the probability of observing $n$ collisions
when performing the reverse process, for any $n \ge 0$. Since $\Omega^R\{S_n^R\}$ is peaked around $n \approx n_p/2$ (during compression
we almost always observe roughly $n_p/2$ collisions, Fig. 4a), it follows that the greatest contribution to $\langle e^{-\beta W} \rangle_F$ comes
from realizations of the expansion process for which $n \approx n_p/2$ (Fig. 4b).



III. NUMBER OF REALIZATIONS NEEDED FOR CONVERGENCE 

When using Eq. 2 to evaluate $\Delta F$, how many realizations do we need to obtain a reasonable estimate? While a
precise answer depends on the desired accuracy, a back-of-the-envelope estimate can be derived as follows.

Imagine that we repeatedly carry out the forward process, either in a laboratory experiment or using numerical
simulations. In doing so we generate trajectories $\{\gamma_1^F, \gamma_2^F, \cdots\}$ sampled from $\mathcal{P}^F[\gamma^F]$. Since the most important
contribution to $\langle e^{-\beta W} \rangle_F$ comes from the region $\mathcal{C}_{dom}^F$, we must sample this region to get a decent estimate of the
average. The probability that a single randomly sampled trajectory falls within $\mathcal{C}_{dom}^F$ is

$$ P = \int_{\mathcal{C}_{dom}^F} d\gamma^F\, \mathcal{P}^F[\gamma^F] = \int_{\mathcal{C}_{typ}^R} d\gamma^R\, \mathcal{P}^R[\gamma^R]\, \exp\left(-\beta W_d^R[\gamma^R]\right) \qquad (32) $$

$$ \approx \exp\left(-\beta \overline{W}_d^R\right) \int_{\mathcal{C}_{typ}^R} d\gamma^R\, \mathcal{P}^R[\gamma^R] \approx \exp\left(-\beta \overline{W}_d^R\right). \qquad (33) $$

Here $\overline{W}_d^R$ is the average work dissipated when performing the reverse process. On the first line we have used Eqs. 23
and 27a, and on the second we have used the fact that the trajectories $\gamma^R \in \mathcal{C}_{typ}^R$ constitute most of the probability
distribution $\mathcal{P}^R$. Thus an estimate of the number of realizations needed to obtain a single trajectory in $\mathcal{C}_{dom}^F$ is:

$$ N^F = P^{-1} \approx \exp\left(\beta \overline{W}_d^R\right). \qquad (34a) $$

The same argument gives us the expected number of reverse realizations needed to obtain a trajectory in $\mathcal{C}_{dom}^R$:

$$ N^R \approx \exp\left(\beta \overline{W}_d^F\right). \qquad (34b) $$

These results suggest that the number of realizations required for convergence grows exponentially in the average
dissipated work, in agreement with the findings of Gore et al. [...], and therefore exponentially with system size
(assuming dissipated work is an extensive property), as concluded by Lua and Grosberg [27]. Interestingly, however,
it is the average amount of work dissipated during the reverse process that determines the convergence of $\langle e^{-\beta W} \rangle$ for
the forward process (Eq. 34a), and vice-versa (Eq. 34b). This implies that, of the two processes, the more dissipative
one is the one for which $\langle e^{-\beta W} \rangle$ converges more rapidly. We can understand this counterintuitive conclusion with the
following plausibility argument. Fig. 6 depicts the work distributions $p^F(W)$ and $p^R(-W)$, when $\overline{W}_d^F > \overline{W}_d^R$: the
mean of $p^F(W)$ is displaced farther to the right of $\Delta F$ than the mean of $p^R(-W)$ is to the left of $\Delta F$, which in turn suggests
that $p^F$ is wider than $p^R$, since the two distributions cross exactly at $W = \Delta F$ [23]. Thus, as measured in standard
deviations, we must reach deeper into the tail of $p^R$ to sample the peak region of $p^F$, than the other way around.
Hence $N^R > N^F$ when $\overline{W}_d^F > \overline{W}_d^R$. This prediction agrees well with recent numerical simulations of an asymmetric
object dragged through a hard-disk gas [...].




FIG. 6: Distributions of work values when $\overline{W}_d^F > \overline{W}_d^R$. Since the two distributions cross at $W = \Delta F$, $p^F$ is wider than $p^R$.
Thus, work values near $-\overline{W}^R$ are more frequently sampled from $p^F(W)$, than those near $\overline{W}^F$ are sampled from $p^R(-W)$.

The piston-and-gas example provides a nice illustration of Eq. 34. From Section I we have $\overline{W}^F \approx 0$, $\overline{W}^R \approx n_p m u^2$,
and $\Delta F = -n_p \beta^{-1} \ln 2$, thus

$$ \overline{W}_d^F \approx n_p\, \beta^{-1} \ln 2, \qquad \overline{W}_d^R \approx n_p \left( m u^2 - \beta^{-1} \ln 2 \right). \qquad (35) $$

Hence $\overline{W}_d^R \gg \overline{W}_d^F$ in the fast piston limit (Eq. 10): rapid compression is much more dissipative than rapid expansion.

Let us now analyze the convergence of $\langle e^{-\beta W} \rangle_R$. The dominant realizations are those for which essentially all the
gas particles begin in the left half of the box (Fig. 3b). The probability of generating such an initial condition is
$(1/2)^{n_p}$, hence the number of realizations needed to observe a single such event is

$$ N^R \approx 2^{n_p}, \qquad (36) $$

which is equal to $\exp(\beta \overline{W}_d^F)$, as predicted by Eq. 34b.

For the convergence of $\langle e^{-\beta W} \rangle_F$, the dominant realizations are represented by Fig. 4b: initially, half the particles
are characterized by a thermal velocity profile, the other half by $v_x \approx 2u$. The probability of generating such a
microstate is roughly

$$ P \approx \frac{n_p!}{(n_p/2)!\,(n_p/2)!}\, \alpha^{n_p/2}. \qquad (37) $$

The first factor counts the number of ways of choosing which $n_p/2$ particles start with $v_x \approx 2u$, and $\alpha =
\exp[-\beta m (2u)^2/2]$ is an estimate of the probability for a single particle to have such a large initial $v_x$. Using
Stirling's approximation for the factorials, Eq. 37 reduces to $2^{n_p} \exp(-n_p \beta m u^2)$. Taking the reciprocal, we get

$$ N^F \approx \exp\left[ n_p \left( \beta m u^2 - \ln 2 \right) \right], \qquad (38) $$

which is equal to $\exp(\beta \overline{W}_d^R)$, as predicted by Eq. 34a. Since the fast piston limit (Eq. 10) implies $\beta m u^2 \gg 1$, it is
legitimate to drop the $\ln 2$ term in this result, obtaining $N^F \approx \exp(n_p \beta m u^2)$. This is consistent with the calculations
of Lua and Grosberg, who obtain $\exp(\beta m L^2/\tau^2)$ when $n_p = 1$ (see section 4 of Ref. [27]).
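The step from Eq. 37 to Eq. 38 is easy to check numerically. The sketch below evaluates the logarithm of Eq. 37 exactly, using `math.lgamma` for the factorials, and compares it with the Stirling-based estimate of Eq. 38; it also evaluates Eq. 36 for comparison. The function names and the sample values of $n_p$ and $\beta m u^2$ are illustrative choices, and $n_p$ is assumed to be even.

```python
import math

def log_prob_dominant_expansion(n_p, beta_m_u2):
    """ln of Eq. 37: binomial(n_p, n_p/2) * alpha**(n_p/2), with alpha = exp(-2*beta*m*u^2)."""
    half = n_p // 2                               # n_p is assumed to be even
    log_binom = math.lgamma(n_p + 1) - 2.0 * math.lgamma(half + 1)
    log_alpha = -2.0 * beta_m_u2                  # ln of exp[-beta m (2u)^2 / 2]
    return log_binom + half * log_alpha

def log_realizations_expansion(n_p, beta_m_u2):
    """ln N^F from Eq. 38: n_p * (beta m u^2 - ln 2)."""
    return n_p * (beta_m_u2 - math.log(2.0))

def log_realizations_compression(n_p):
    """ln N^R from Eq. 36: n_p * ln 2."""
    return n_p * math.log(2.0)

if __name__ == "__main__":
    n_p, x = 1000, 10.0
    print(-log_prob_dominant_expansion(n_p, x), log_realizations_expansion(n_p, x))
    print(log_realizations_compression(n_p))
```

Working with logarithms avoids overflow, since both $N^F$ and $N^R$ are far too large to represent directly once $n_p$ exceeds a few hundred.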

These results verify that the more dissipative process (compression) is the one for which the exponential average
converges more rapidly. We can understand why this is the case, without performing explicit quantitative estimates.
For the compression process, as mentioned, a dominant realization begins with all the particles in the left half of
the box, thus $n_p$ particles must simultaneously satisfy a condition that is not so unusual for any given particle. For
the expansion process, on the other hand, half the particles must begin with $v_x \approx 2u$, therefore $n_p/2$ particles must
simultaneously satisfy a condition that is very unusual for even a single particle (since $u \gg v_{th}$). The latter situation
is the less likely by far. Of course, when $n_p \gg 1$, both $N^F$ and $N^R$ are extremely large [...].

In practice, the convergence difficulties associated with Eq. 2 are mitigated somewhat if we have data for both
the forward and the reverse processes. In that case, Bennett's acceptance ratio method [23, ...] converges
faster than a direct exponential average of either the forward or reverse work values. It would be useful to derive
a simple estimate, analogous to Eq. 34, of the number of realizations required for this method to converge, and to
develop heuristic insight regarding the realizations that make the most important contribution to the acceptance ratio
estimate of $\Delta F$.

Finally, recall that a measure of the difference between two normalized distributions, $f_1$ and $f_0$, is given by the
relative entropy [...]:

$$ D[f_1 | f_0] = \int f_1 \ln \frac{f_1}{f_0} \ge 0, \qquad (39) $$

where the integral is over the space of variables on which $f_0$ and $f_1$ are defined. Applying this definition to the forward
and reverse distributions of trajectories, and identifying $\gamma^F$ with its twin $\gamma^R$, we get

$$ D[\mathcal{P}^F | \mathcal{P}^R] = \int d\gamma^F\, \mathcal{P}^F[\gamma^F] \ln \frac{\mathcal{P}^F[\gamma^F]}{\mathcal{P}^R[\gamma^R]} = \beta \overline{W}_d^F, \qquad (40a) $$

using Eq. 23, and similarly

$$ D[\mathcal{P}^R | \mathcal{P}^F] = \beta \overline{W}_d^R. \qquad (40b) $$

Analogous identities have been derived for the steady states of Markov chains [...], and for the work distributions
arising in the context of free energy estimation; and the physical significance of relative entropy for equilibrium
and nonequilibrium fluctuations has recently been discussed in Ref. [...]. Eq. 40 suggests that there might be a natural
information-theoretic interpretation of Eq. 34.
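As a small consistency check on Eq. 40a, the sketch below evaluates the relative entropy between work distributions rather than trajectory distributions. Because the log-ratio in Eq. 23 depends on the trajectory only through its work, the same identity holds for $p^F(W)$ and $p^R(-W)$. The Gaussian form of $p^F$ is an assumption made for illustration (it fixes $\Delta F = \overline{W}^F - \beta\sigma^2/2$ and places $p^R(-W)$ at $\overline{W}^F - \beta\sigma^2$), and the function names are hypothetical.

```python
import math

def log_normal_pdf(w, mean, sigma):
    """Logarithm of a Gaussian density."""
    return -0.5 * ((w - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def relative_entropy_forward_reverse(mean_f, sigma, beta=1.0, half_width=10.0, n=40000):
    """D[p^F | p^R] in work space for the assumed Gaussian forward distribution."""
    mean_r = mean_f - beta * sigma ** 2           # center of p^R(-W) in the Gaussian case
    lo = mean_f - half_width * sigma
    dw = 2.0 * half_width * sigma / n
    total = 0.0
    for i in range(n):
        w = lo + (i + 0.5) * dw                   # midpoint rule
        log_pf = log_normal_pdf(w, mean_f, sigma)
        log_pr = log_normal_pdf(w, mean_r, sigma)
        total += math.exp(log_pf) * (log_pf - log_pr) * dw
    return total

def beta_mean_dissipated_work(mean_f, sigma, beta=1.0):
    """beta times the mean forward dissipated work, using the Gaussian value of Delta F."""
    delta_f = mean_f - 0.5 * beta * sigma ** 2
    return beta * (mean_f - delta_f)

if __name__ == "__main__":
    print(relative_entropy_forward_reverse(3.0, 1.5), beta_mean_dissipated_work(3.0, 1.5))
```

In this symmetric Gaussian case both relative entropies equal $\beta^2\sigma^2/2$, so the estimates of Eq. 34 coincide for the two processes.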

IV. FREE ENERGY PERTURBATION 

The perturbation method for estimating free energy differences is based on the identity 

$$ \left\langle e^{-\beta \Delta H} \right\rangle_A = e^{-\beta \Delta F}, \tag{41} $$

where $\Delta H(\Gamma) = H(\Gamma;B) - H(\Gamma;A)$, and $\langle \cdots \rangle_A$ denotes an average over microstates $\Gamma$ sampled from the canonical
ensemble A 0,1121. [In this section, I explicitly assume that the Hamiltonians $H(A)$ and $H(B)$ are finite-valued
throughout phase space. This precludes situations such as the piston-and-gas example, particles with perfectly hard
cores, etc.] The convergence problems that arise with the nonequilibrium work theorem also plague the free energy
perturbation identity [43]. In the case of Eq. (41) we can frame this issue by considering the typical microstates sampled
from the equilibrium ensemble A, and the rare but dominant microstates that contribute most to the average of $e^{-\beta \Delta H}$.
In what follows I discuss how the ideas developed in earlier sections of this paper apply to Eq. (41).

The left side of Eq. (41) is most naturally viewed as an equilibrium average, as described above. An alternative
perspective, however, treats Eq. (41) as a special case of Eq. (1) obtained in the "instantaneous switching" limit, $\tau \to 0$,
in which $\lambda$ is changed suddenly from A to B 0. The system then has no opportunity to evolve during the process,
hence a realization is described not by a trajectory, but rather by a single microstate (sampled from A). The work
$W$ is given by $\Delta H$, evaluated at this microstate. If we view Eq. (41) as pertaining to a forward perturbation ($A \to B$),
then the reverse perturbation involves sampling from equilibrium state B:

$$ \left\langle e^{+\beta \Delta H} \right\rangle_B = e^{+\beta \Delta F}, \tag{42} $$

where $\Delta H$ and $\Delta F$ are defined identically in Eqs. (41) and (42).
By analogy with Eq[^ we can rewrite Eq. (41) as follows:

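$$ \left\langle e^{-\beta W_d^F} \right\rangle_A = \int d\Gamma\, p_A(\Gamma)\, e^{-\beta W_d^F(\Gamma)} \equiv \int d\Gamma\, q_A(\Gamma), \tag{43} $$

where $p_A$ is the equilibrium distribution for state A [Eq. (14)], and $W_d^F = \Delta H - \Delta F$. Similarly rewriting Eq. (42) [with
$W_d^R = -\Delta H + \Delta F$], let us now define $\zeta_{\rm typ}^A$, $\zeta_{\rm dom}^A$, $\zeta_{\rm typ}^B$, and $\zeta_{\rm dom}^B$ as the regions of phase space where $p_A$, $q_A$, $p_B$, and
$q_B$, respectively, are peaked. These contain the typical and dominant microstates for the two perturbations. Eq. (27)
now becomes

$$ \zeta_{\rm dom}^A \approx \zeta_{\rm typ}^B, \qquad \zeta_{\rm dom}^B \approx \zeta_{\rm typ}^A. \tag{44} $$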
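One way to see why the dominant microstates of one perturbation coincide with the typical microstates of the other is to evaluate the weighted distribution $q_A$ explicitly. The short calculation below is a sketch that assumes the usual canonical normalization, $p_A = e^{-\beta H(\Gamma;A)}/Z_A$ and $e^{-\beta \Delta F} = Z_B/Z_A$, with $Z_A$ and $Z_B$ the partition functions of the two ensembles (symbols introduced here for the illustration):

```latex
\begin{align}
q_A(\Gamma) &= p_A(\Gamma)\, e^{-\beta [\Delta H(\Gamma) - \Delta F]}
   = \frac{1}{Z_A}\, e^{-\beta H(\Gamma;A)}\,
     e^{-\beta [H(\Gamma;B) - H(\Gamma;A)]}\, e^{+\beta \Delta F} \\
  &= \frac{1}{Z_A}\,\frac{Z_A}{Z_B}\, e^{-\beta H(\Gamma;B)}
   = p_B(\Gamma) .
\end{align}
```

The same steps with A and B interchanged give $q_B = p_A$, so the region where $q_A$ is peaked is the region where $p_B$ is peaked, and vice versa.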

Thus, when implementing the free energy perturbation method, by sampling from one distribution (say, A), the 
collection of sampled microstates must be large enough to include a reasonable number that are typical of the other 
distribution (B), otherwise we will not achieve convergence of the exponential average. If there is very little overlap 
between the two equilibrium distributions in phase space, then the number of samples required to satisfy this condition 
is prohibitively large p[2| . 

Lower bounds $N_c^F$ and $N_c^R$ on the required numbers of realizations are given by analogues of Eq. (34):

$$ \ln N_c^F \sim \beta\,\overline{W}_d^R = D[p_B|p_A] \ge 0, \tag{45a} $$
$$ \ln N_c^R \sim \beta\,\overline{W}_d^F = D[p_A|p_B] \ge 0, \tag{45b} $$

where the overbars now denote canonical averages with respect to the ensembles A and B. Of the two perturbations,
the one with larger $\overline{W}_d$ requires fewer samples for convergence of the exponential average.
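These estimates can be checked in a case where everything is known in closed form. The sketch below is an assumed toy model, not an example from this paper: a single coordinate $x$ with $H(A) = k_A x^2/2$ and $H(B) = k_B x^2/2$, for which the canonical distributions are Gaussian and $\Delta F = (2\beta)^{-1} \ln(k_B/k_A)$. The function names and parameter values are hypothetical. The code compares a Monte Carlo estimate of $\beta \overline{W}_d$ for each perturbation with the corresponding relative entropy.

```python
import math
import random


def kl_gaussian(var_p, var_q):
    """Relative entropy D[p|q] between zero-mean Gaussians with the given variances."""
    ratio = var_p / var_q
    return 0.5 * (ratio - 1.0 - math.log(ratio))


def mean_dissipated_work(k_from, k_to, beta, n, rng):
    """Monte Carlo estimate of beta * <W_d> for the sudden switch k_from -> k_to.

    Microstates x are sampled from the canonical ensemble of H = k_from * x**2 / 2,
    and W_d = Delta H - Delta F for that direction of the perturbation.
    """
    delta_f = 0.5 * math.log(k_to / k_from) / beta
    sigma = 1.0 / math.sqrt(beta * k_from)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, sigma)
        total += 0.5 * (k_to - k_from) * x * x - delta_f
    return beta * total / n


if __name__ == "__main__":
    rng = random.Random(1)
    beta, k_a, k_b = 1.0, 1.0, 16.0  # assumed values: B is the stiffer well
    wd_f = mean_dissipated_work(k_a, k_b, beta, 100000, rng)  # sample A, switch to B
    wd_r = mean_dissipated_work(k_b, k_a, beta, 100000, rng)  # sample B, switch to A
    print("beta*<W_d^F> =", wd_f, " D[p_A|p_B] =", kl_gaussian(1 / (beta * k_a), 1 / (beta * k_b)))
    print("beta*<W_d^R> =", wd_r, " D[p_B|p_A] =", kl_gaussian(1 / (beta * k_b), 1 / (beta * k_a)))
    # Eq. (45): ln N_c^F ~ beta*<W_d^R> and ln N_c^R ~ beta*<W_d^F>
    print("ln N_c^F ~", wd_r, "  ln N_c^R ~", wd_f)
```

With these values the forward perturbation, which samples the broad distribution A, has the larger mean dissipated work and therefore the smaller estimated sample requirement: the narrow distribution B lies well inside the region that samples from A visit routinely, whereas samples from B rarely reach the tails of A.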

These results are illustrated by Widom's particle insertion method for computing a chemical potential [T^ .
Imagine a fluid of $N + 1$ particles, and suppose that $H(A)$ describes the situation in which $N$ of the particles interact
with one another through a pairwise potential, while the remaining, "tagged" particle is uncoupled from the rest;
and $H(B)$ describes the situation in which all $N + 1$ particles are mutually coupled via the pairwise potential. The
free energy difference $\Delta F = F_B - F_A$ is the excess chemical potential of the fluid, provided $N$ is large enough to
recover bulk properties. In principle, we can estimate $\Delta F$ either by using the forward perturbation, $A \to B$ (particle
insertion), or with the reverse perturbation, $B \to A$ (deletion).

In a dense fluid, particle insertion usually generates a very large value of $\Delta H$, while for deletion $\Delta H$ is typically
modest. Thus $\overline{W}_d^F > \overline{W}_d^R$, and Eq. (45) predicts that the insertion method converges more rapidly than the deletion
method, as indeed observed empirically 12]. To understand this in terms of typical and dominant microstates, note 
that when sampling from ensemble B, the (interacting) tagged particle typically occupies its own small volume within 
the fluid, from which the remaining particles are excluded. Thus, by Eq. (44), to achieve convergence in Eq. (41) we must
sample sufficiently from ensemble A to obtain microstates in which the non-interacting, tagged particle happens to
sit inside a cavity created by the spontaneous fluctuations of the remaining $N$-particle fluid. This condition carries
an entropic cost roughly equal to the free energy of forming a suitably large cavity. 

Conversely, when sampling from ensemble A the (non-interacting) tagged particle is typically found within the 
repulsive core of one of the other particles, hence to succeed with the particle deletion method, we must generate such
microstates when sampling from ensemble B. This carries an enthalpic (energetic) cost, determined by the strength
of the repulsive core of the pairwise potential. Because the repulsive cores are generally described by very steep 
potentials, this enthalpic penalty is much larger than the entropic penalty described above: thermal fluctuations are 
much more likely to generate a cavity large enough to accommodate a new particle, than they are to squeeze two 
particles into a volume meant only for one. Hence the dominant realizations of the forward perturbation (insertion), 
are much less rare than those of the reverse perturbation (deletion). 



V. DISCUSSION 



The central result of this paper, Eq. (27), is a duality that relates the dominant realizations of a given process to
typical realizations of the conjugate process. In a nutshell, it states that to achieve convergence in Eq. (1) we must
observe realizations during which the system appears as though it is evolving backward in time. I will now sketch
an interpretation of this result similar to the discussion of causal and anti-causal response theory found in Ref. [45].
Following that, I will briefly discuss the validity of Eq. (27) in situations involving non-Hamiltonian (including stochastic)
equations of motion.

Let us picture the ensemble of forward trajectories, $\{\gamma_1^F, \gamma_2^F, \cdots\}$, as a swarm of points evolving independently in
phase space (from $t = 0$ to $t = \tau$), and let $f^F(\Gamma,t) = \langle \delta(\Gamma - \Gamma_t^F) \rangle$ denote the corresponding time-dependent phase
space distribution. This distribution obeys the Liouville equation,

$$ \frac{\partial f^F}{\partial t} = \frac{\partial H_t^F}{\partial q}\frac{\partial f^F}{\partial p} - \frac{\partial H_t^F}{\partial p}\frac{\partial f^F}{\partial q}, \tag{46} $$

where $H_t^F = H(\Gamma; \lambda_t^F)$. The assumption of initial equilibrium is a boundary condition imposed at $t = 0$:

$$ f^F(\Gamma,0) = \frac{1}{Z_A} \exp[-\beta H(\Gamma;A)]. \tag{47} $$

Now consider a distribution $g^F(\Gamma,t)$ evolving under the same dynamics, Eq. (46), but satisfying a boundary condition
at $t = \tau$ (rather than at $t = 0$):

$$ g^F(\Gamma,\tau) = \frac{1}{Z_B} \exp[-\beta H(\Gamma;B)]. \tag{48} $$

This distribution describes an ensemble that ends, rather than begins, in a state of thermal equilibrium. In the 
language of Ref. [45], $f^F$ corresponds to a causal ensemble of trajectories, determined by initial conditions, while $g^F$
is anti-causal, determined by final conditions. The central result of this paper can now be restated as follows: while
the typical causal trajectories are the ones we ordinarily observe, the typical anti-causal trajectories are the ones that 
dominate the exponential average. This follows from the simple observation that the anti-causal ensemble of forward 
trajectories is just the conjugate image of the causal ensemble of reverse trajectories. 

In the context of linear response theory, Evans and Searles [43] have shown that anti-causal ensembles give rise to
Green-Kubo "anti-transport" coefficients. In an earlier theoretical study of dilute gases, Cohen and Berlin |4^] derived 
an anti-causal version of the Boltzmann equation, by applying the assumption of molecular chaos to future rather than 
past pair distribution functions. In both papers the anti-causal behavior is associated with violations of the second law 
of thermodynamics: the Green-Kubo coefficients of Ref. ^4^] have the "wrong" signs, and the Boltzmann-like equation 
of Ref. obeys an anti-H theorem. The situation is similar here: anti-causal ensembles are associated with negative 
average values of dissipated work; the second law of thermodynamics, by contrast, asserts that irreversible processes 
are accompanied by positive average dissipated work. 

Eq. (23) provides an amusing connection between the conjugate pairing of trajectories (microscopic reversibility)
and the second law of thermodynamics (macroscopic irreversibility), illustrated in the piston-and-gas context by the 
following thought experiment. Imagine that we are shown a movie in which we see the microscopic evolution of 
the gas as the piston moves outward from A to B, and we are asked to guess whether this movie depicts an actual 
realization of the expansion process, or whether, instead, a realization of the compression process was filmed, and now 
that movie is being run backward. I will refer to this as "guessing the direction of time". To analyze this situation
quantitatively, let $\gamma^F$ specify the microscopic evolution we observe when watching this movie, and $\gamma^R$ its conjugate
twin. Then the task of guessing the direction of time is an exercise in statistical inference, in which we compare the
likelihoods of two hypotheses. Specifically, we ask whether it is more likely that we obtain $\gamma^F$ when performing the
forward process, or $\gamma^R$ when performing the reverse process. By Eq. (23), the ratio of these likelihoods is $\exp(\beta W_d^F)$.
Hence if $W_d^F > 0$, we opt for the first hypothesis, namely that the piston was indeed withdrawn from A to B; whereas
if $W_d^F < 0$ (equivalently $W_d^R > 0$) then we guess that the piston was pushed into the gas, and we are seeing a movie
of that process in time-reversed order. Thus when asked to guess the arrow of time, we optimize our answer simply 
by insisting that the sign of the dissipated work be positive, in agreement with the second law. 
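The decision rule just described can be written in a few lines. The Python sketch below is illustrative only. The function names are hypothetical, and `prob_forward` additionally assumes equal prior odds for the two hypotheses, an assumption not made explicitly in the text, under which the likelihood ratio $\exp(\beta W_d^F)$ becomes a posterior probability $1/[1 + \exp(-\beta W_d^F)]$.

```python
import math


def prob_forward(work, delta_f, beta):
    """Posterior probability that the movie shows the forward process.

    ASSUMPTION: the two hypotheses have equal prior odds, so the posterior
    odds equal the likelihood ratio exp(beta * W_d), with W_d = work - delta_f.
    """
    x = beta * (work - delta_f)
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def guess_direction(work, delta_f, beta):
    """Choose the hypothesis for which the dissipated work is positive."""
    # A tie (W_d exactly zero) is assigned to "reverse" arbitrarily.
    return "forward" if beta * (work - delta_f) > 0 else "reverse"


if __name__ == "__main__":
    beta, delta_f = 1.0, 0.5  # assumed illustrative values
    for w in (-3.0, 0.5, 4.0):
        print(w, guess_direction(w, delta_f, beta), prob_forward(w, delta_f, beta))
```

When $|\beta W_d^F|$ is much larger than one the guess is nearly certain, while for $|\beta W_d^F|$ of order one or smaller the two hypotheses are hard to tell apart.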

It is natural to use Hamilton's equations to describe a thermally isolated classical system, as in this paper. If the 
system is in contact with a heat reservoir, however, then there are various ways to model its evolution. The most 
intuitively natural is to treat the combined system and reservoir as a very large, Hamiltonian system. Alternatively, 
one can describe the evolution of the system itself using stochastic equations of motion, such as the Metropolis Monte 
Carlo algorithm, or Langevin dynamics. Yet another approach involves deterministic but non-Hamiltonian equations 
of motion, such as Gaussian thermostats, designed to mock up the presence of a heat reservoir. Eq. (1) has been derived
for all these cases, so it is natural to ask whether the central result of the present paper, Eq. (27), also holds for these
various schemes. 

Since the heart of the argument in Section III follows from Eq. (22), two conditions are sufficient for the central results
of this paper to hold for any given one of the above-mentioned schemes. First, there must exist a conjugate pairing of
forward and reverse trajectories. Second, the probability of observing a particular trajectory $\gamma^F$ during the forward
process, and that of observing its twin $\gamma^R$ during the reverse process, must satisfy Eq. (23). These conditions have
been verified explicitly when the system evolves under a discrete-time Monte Carlo scheme satisfying detailed balance 
(such as the Metropolis algorithm) Q or under Langevin dynamics |^ 49] . When the evolution of the system 
is modeled with isokinetic Gaussian equations of motion, the validity of Eq. (23) follows from the analysis of Ref. 0.
Thus Eq. (23) of the present paper applies when these schemes are used to model the evolution of the system.

When the system and reservoir are treated together as a very large, Hamiltonian system, then Eq. (23) can be
derived by repeating the steps of Section^ but working in the full phase space containing all the interacting degrees 
of freedom, and then projecting out the reservoir degrees of freedom. The analysis becomes slightly complicated if 
the coupling between the system and reservoir is not negligible, but this technical issue is handled much as in Ref. 
(see, however, Refs. [50, 51]), and the details will not be presented here.

Finally, the nonequilibrium work theorem is just one of a number of (mostly recent) predictions concerning the 
statistical mechanics of systems far from thermal equilibrium. Others include the Kawasaki identity [52] and its
generalization by Morriss and Evans [53], the fluctuation theorem [54, 55, 56, 57, 58, 59, 60, 61], Hatano and Sasa's
equality for transitions between nonequilibrium steady states [62, 63, 64], and Adib's microcanonical version of
Eq. (1) [65]. It remains to be investigated whether the analysis of the present paper is valid (and relevant) in the
context of these other, closely related results. 

It is a pleasure to acknowledge useful correspondence with Alexander Grosberg and Rhonald Lua. This work was 
supported by the United States Department of Energy, under contract W-7405-ENG-36.